
Creators/Authors contains: "Horton, Nicholas J."


  1. Meng, X-L (Ed.)
    Many data science students and practitioners are reluctant to adopt good coding practices as long as the code ‘works.’ However, code standards are an important part of modern data science practice, and they play an essential role in the development of data acumen. Good coding practices lead to more reliable code and save more time than they cost, making them important even for beginners. We believe that principled coding is vital for quality data science practice. To effectively instill these practices within academic programs, instructors and programs need to begin establishing these practices early, to reinforce them often, and to hold themselves to a higher standard while guiding students. We describe key aspects of good coding practices for data science, illustrating with examples in R and in Python, though similar standards are applicable to other software environments. Practical coding guidelines are organized into a top ten list. 
  2. Abstract

    Text provides a compelling example of unstructured data that can be used to motivate and explore classification problems. Challenges arise in representing features of text and in helping students link text represented as character strings to features that connect with underlying phenomena. To observe how students reason with text data in scenarios designed to elicit certain aspects of the domain, we employed a task‐based interview method using a structured protocol with six pairs of undergraduate students. Our goal was to shed light on students' understanding of text as data using a motivating task: classifying headlines as “clickbait” or “news.” Three types of features (function, content, and form) surfaced, the majority from the first scenario. Our analysis of the interviews indicates that this sequence of activities engaged the participants in thinking at both the human‐perception level and the computer‐extraction level, and in conceptualizing connections between the two.

     
  3. While coursework provides undergraduate data science students with some relevant analytic skills, many are not given the rich experiences with data and computing they need to be successful in the workplace. Additionally, students often have limited exposure to team-based data science and the principles and tools of collaboration that are encountered outside of school.

    In this paper, we describe the DSC-WAV program, an NSF-funded data science workforce development project in which teams of undergraduate sophomores and juniors work with a local non-profit organization on a data-focused problem. To help students develop a sense of agency and improve confidence in their technical and non-technical data science skills, the project promoted a team-based approach to data science, adopting several processes and tools intended to facilitate this collaboration.

    Evidence from the project evaluation, including participant survey and interview data, is presented to document the degree to which the project was successful in engaging students in team-based data science, and how the project changed the students' perceptions of their technical and non-technical skills. We also examine opportunities for improvement and offer insight to other data science educators who may want to implement a similar team-based approach to data science projects at their own institutions.

     
  4. null (Ed.)